Imbalanced network traffic classification method based on improved forest rotation algorithm
DING Yaojun
Journal of Computer Applications    2015, 35 (12): 3348-3351.   DOI: 10.11772/j.issn.1001-9081.2015.12.3348
To address the low accuracy of imbalanced network traffic classification, an improved rotation forest algorithm was proposed that combines the Bootstrap sampling of the Bagging algorithm with a base classifier selection algorithm based on accuracy ranking. First, the original training set was divided into subsets by feature, Bagging was used to sample each subset, and the coefficient matrix of principal components was computed by Principal Component Analysis (PCA). Then, the features of each subset were transformed using the original training set and the coefficient matrix of principal components to generate new training subsets. To increase the diversity of the training sets, Bagging was applied again to sample the new subsets, which were then used to train C4.5 base classifiers. Finally, the testing set was used to evaluate the base classifiers, which were sorted and filtered by overall classification accuracy; the classifiers with high accuracy were chosen to generate a consensus classification result. An imbalanced network traffic data set was used in the experiment, and precision and recall were used to evaluate C4.5, Bagging, rotation forest and the improved rotation forest. The time efficiency of the four algorithms was evaluated by the training time and testing time of the models. The experimental results show that the classification accuracy of the improved rotation forest algorithm is above 99.5% on the World Wide Web (WWW), Mail, Attack and Peer-to-Peer (P2P) protocols, and that its recall is also higher than that of rotation forest, Bagging and C4.5. The proposed algorithm can be used for network intrusion forensics, for maintaining network security and for improving the quality of network service.
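The sampling, ranking and voting steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the PCA rotation step is omitted, a hypothetical one-feature threshold rule (`train_stump`) stands in for C4.5, and every function name and parameter (`n_models`, `keep`, `seed`) is invented for the example.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (the Bagging resampling step)."""
    return [rng.choice(rows) for _ in rows]

def train_stump(rows):
    """Hypothetical stand-in for C4.5: a two-class threshold rule on one feature."""
    by_label = {}
    for (value,), label in rows:
        by_label.setdefault(label, []).append(value)
    means = sorted((sum(v) / len(v), label) for label, v in by_label.items())
    if len(means) == 1:  # the bootstrap sample contained a single class
        only = means[0][1]
        return lambda x: only
    (lo_mean, lo_label), (hi_mean, hi_label) = means[0], means[-1]
    cut = (lo_mean + hi_mean) / 2  # midpoint between the two class means
    return lambda x: lo_label if x[0] < cut else hi_label

def accuracy(predict, rows):
    """Fraction of (features, label) rows that predict() labels correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def select_top(classifiers, eval_rows, keep):
    """Sort classifiers by overall accuracy, best first, and keep `keep` of them."""
    ranked = sorted(classifiers, key=lambda clf: accuracy(clf, eval_rows), reverse=True)
    return ranked[:keep]

def majority_vote(classifiers, x):
    """Combine the kept classifiers by plurality vote."""
    return Counter(clf(x) for clf in classifiers).most_common(1)[0][0]

def build_ensemble(train_rows, eval_rows, n_models=10, keep=5, seed=0):
    """Train one stump per bootstrap sample, then keep the most accurate ones."""
    rng = random.Random(seed)
    models = [train_stump(bootstrap_sample(train_rows, rng)) for _ in range(n_models)]
    return select_top(models, eval_rows, keep)

# Toy, made-up data: one feature per flow, two protocol labels.
train_rows = [((0.10,), "WWW"), ((0.15,), "WWW"), ((0.20,), "WWW"), ((0.25,), "WWW"),
              ((0.75,), "P2P"), ((0.80,), "P2P"), ((0.85,), "P2P"), ((0.90,), "P2P")]
eval_rows = [((0.12,), "WWW"), ((0.22,), "WWW"), ((0.78,), "P2P"), ((0.88,), "P2P")]
ensemble = build_ensemble(train_rows, eval_rows)
print(majority_vote(ensemble, (0.18,)), majority_vote(ensemble, (0.82,)))
```

Ranking on a held-out set before voting is the design choice the abstract emphasizes: weak base classifiers are dropped instead of being averaged in.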
GOMDI: GPU OpenFlow massive data network analysis model
ZHANG Wei XIE Zhenglong DING Yaojun ZHANG Xiaoxiao
Journal of Computer Applications    2014, 34 (8): 2243-2247.   DOI: 10.11772/j.issn.1001-9081.2014.08.2243
OpenFlow enhances the Quality of Service (QoS) of traditional networks, but it has drawbacks such as inefficient network session identification and poor selection of packet forwarding paths. Building on current OpenFlow research, this paper proposed the GPU OpenFlow Massive Data Network Analysis (GOMDI) model, which integrates a biological sequence algorithm, a GPU parallel computing algorithm and machine learning methods. The network session matching algorithm and the path selection algorithm of GOMDI were designed. The experimental results show that, in a real network, the GOMDI network session matching algorithm achieves a speedup of more than 300 times over the CPU environment, and that its path selection algorithm keeps the network packet loss rate below 5% and the network delay below 20 ms. Thus, the GOMDI model can effectively improve network performance and meet the need for real-time processing of massive information in a big data environment.
Internet traffic classification method based on selective clustering ensemble of mutual information
DING Yaojun CAI Wandong
Journal of Computer Applications    2013, 33 (01): 80-82.   DOI: 10.3724/SP.J.1087.2013.00080
Because Internet traffic is difficult to label and a single clustering algorithm generalizes poorly, a selective clustering ensemble method based on Mutual Information (MI) was proposed to improve the accuracy of traffic classification. In the method, the Normalized Mutual Information (NMI) between the clustering results of the K-means algorithm with different initial cluster numbers and the distribution of protocol labels in the training set was computed first, and a series of K values (the initial cluster numbers of the K-means algorithm) was then selected based on NMI. Finally, a consensus function based on Quadratic Mutual Information (QMI) was used to build the consensus partition, and the clusters were labeled using a semi-supervised method. The overall accuracies of the clustering ensemble method and the single clustering algorithm were compared over four testing sets, and the experimental results show that the overall accuracy of the clustering ensemble method can reach 90%. By using a clustering ensemble model to classify Internet traffic, the proposed method improved both the overall accuracy of traffic classification and the stability of classification across different data sets.
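The NMI-based selection of K described above can be illustrated with a short sketch. It rests on assumptions the abstract does not settle: NMI is normalized by the geometric mean of the two entropies, the K-means runs are taken as given (one precomputed cluster assignment per K), and `select_ks` and its `top` parameter are hypothetical names. The QMI consensus function and the semi-supervised labeling are not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def nmi(a, b):
    """NMI with geometric-mean normalization (an assumed choice)."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0 or hb == 0:  # a constant labeling carries no information
        return 0.0
    return mutual_information(a, b) / math.sqrt(ha * hb)

def select_ks(partitions_by_k, protocol_labels, top):
    """Return the `top` values of K whose partitions agree best with the labels."""
    ranked = sorted(partitions_by_k,
                    key=lambda k: nmi(partitions_by_k[k], protocol_labels),
                    reverse=True)
    return ranked[:top]

# Made-up example: four flows and three candidate K-means partitions.
protocol_labels = ["WWW", "WWW", "P2P", "P2P"]
partitions_by_k = {
    1: [0, 0, 0, 0],
    2: [0, 0, 1, 1],
    3: [0, 0, 1, 2],
}
print(select_ks(partitions_by_k, protocol_labels, top=2))
```

In this toy input the K=2 partition matches the labels exactly, the K=3 partition splits one protocol across two clusters, and the K=1 partition carries no information, so the ranking keeps K=2 and K=3.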